Muhammad Haris Rao
Recall that given a probability space $\left( \Omega, \mathcal{F}, \mathbf{P} \right)$ and a random variable $X : \Omega \longrightarrow \mathbb{R}$, the expectation is defined as $$ \mathbf{E} \left[ X \right] = \int_\Omega X \, d\mathbf{P} $$ where the integral is in the Lebesgue sense. We would like some techniques for computing this. Let $P_X : \mathcal{B} \left( \mathbb{R} \right) \longrightarrow [0, 1]$ be the induced probability measure on the Borel sets, defined at $A \in \mathcal{B}(\mathbb{R})$ by $$ P_X \left( A \right) = \mathbf{P} \left( X^{-1} (A) \right) $$ This is also known as the pushforward measure induced by $X$. We can use the following result:
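For example, if $X$ is a Bernoulli random variable with $\mathbf{P}(X = 1) = p$ and $\mathbf{P}(X = 0) = 1 - p$, then the pushforward measure is $$ P_X(A) = p \, \mathbf{1}\{1 \in A\} + (1 - p) \, \mathbf{1}\{0 \in A\}, \qquad A \in \mathcal{B}(\mathbb{R}) $$ so $P_X$ is a measure on $\mathbb{R}$ concentrated on the two points $0$ and $1$.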
Theorem: Let $(X, \mathcal{A}, \mu)$ be a measure space, $(Y, \mathcal{B})$ a measurable space, $f : X \longrightarrow Y$ measurable, and $\mu^* : \mathcal{B} \longrightarrow [0, \infty]$ the pushforward measure. Then for any $\phi : Y \longrightarrow \mathbb{R}$ integrable, $$ \int_Y \phi \, d \mu^* = \int_X \phi \circ f \, d \mu $$ Moreover, $\phi$ is integrable with respect to $\mu^*$ if and only if $\phi \circ f$ is integrable with respect to $\mu$.
Corollary: Let $\left( \Omega, \mathcal{F}, \mathbf{P} \right)$ be a probability space and $X : \Omega \longrightarrow \mathbb{R}$ be an integrable random variable. Then $$ \mathbf{E} \left[X\right] = \int_\mathbb{R} x \, d P_X $$
Proof. Let $\text{Id}_{\mathbb{R}} : \mathbb{R} \longrightarrow \mathbb{R}$ be the identity function mapping each $x \in \mathbb{R}$ to itself. If $X$ is integrable, then $\text{Id}_\mathbb{R} \circ X = X$ is integrable. By the above theorem, it follows that $\text{Id}_{\mathbb{R}}$ is integrable on $\mathbb{R}$ with respect to $P_X$ and $$ \int_\mathbb{R} x \, d P_X = \int_\mathbb{R} \text{Id}_\mathbb{R} \, d P_X = \int_\Omega \text{Id}_\mathbb{R} \circ X \, d \mathbf{P} = \int_\Omega X \, d \mathbf{P} = \mathbf{E}[X] $$ $\blacksquare$
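Continuing the Bernoulli example above, $P_X$ is concentrated on $\{0, 1\}$, so the corollary gives $$ \mathbf{E}[X] = \int_\mathbb{R} x \, d P_X = 0 \cdot (1 - p) + 1 \cdot p = p $$ without ever referring back to the underlying space $\Omega$.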
Definition: A random variable $X : \Omega \longrightarrow \mathbb{R}$ is said to be continuous if the induced measure $P_X : \mathcal{B} \left( \mathbb{R} \right) \longrightarrow [0, 1]$ has a density with respect to Lebesgue measure. That is, there exists a measurable $f_X : \mathbb{R} \longrightarrow \mathbb{R}_{\ge 0}$ such that if $A \in \mathcal{B}(\mathbb{R})$ then $$ P_X(A) = \int_A f_X(x) \, dx $$
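For example, a uniform random variable $X$ on $[0, 1]$ is continuous with density $f_X = \mathbf{1}_{[0,1]}$: for any $A \in \mathcal{B}(\mathbb{R})$, $$ P_X(A) = \lambda \left( A \cap [0, 1] \right) = \int_A \mathbf{1}_{[0,1]} \, dx $$ where $\lambda$ denotes Lebesgue measure.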
For continuous random variables, we can use the change of variables formula:
Theorem(Change of Variables): If $\mu, \lambda$ are measures on a measurable space $(X, \mathcal{A})$ with $\mu$ having density $\frac{d\mu}{d \lambda} : X \longrightarrow \mathbb{R}_{\ge 0}$, then for any measurable $f : X \longrightarrow \mathbb{R}$ integrable with respect to $\mu$, it holds that $$ \int_X f \, d \mu = \int_X f \cdot \frac{d \mu}{d \lambda} \, d \lambda $$
Taking the $\mu, \lambda$ in the statement to be $P_X$ and the Lebesgue measure respectively, it follows easily that for a continuous integrable random variable $X$ with density $f_X$, the expectation is $$ \mathbf{E}[X] = \int_\Omega X \, d \mathbf{P} = \int_\mathbb{R} x \, d P_X = \int_\mathbb{R} x f_X (x) \, dx $$
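As a worked example, let $X$ be exponential with rate $\alpha > 0$, so that $f_X(x) = \alpha e^{-\alpha x} \mathbf{1}\{x \ge 0\}$. Integrating by parts, $$ \mathbf{E}[X] = \int_0^\infty x \alpha e^{-\alpha x} \, dx = \left[ -x e^{-\alpha x} \right]_0^\infty + \int_0^\infty e^{-\alpha x} \, dx = 0 + \frac{1}{\alpha} = \frac{1}{\alpha} $$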
Definition: A random variable $X : \Omega \longrightarrow \mathbb{R}$ is said to be discrete if there exists a countable set $C \subseteq \mathbb{R}$ such that $\mathbf{P}(X \in C) = 1$. Equivalently, $\mathbf{P}\left( X^{-1}(C) \right) = 1$.
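For example, a Poisson random variable $X$ with parameter $\alpha > 0$ is discrete: it takes values in the countable set $C = \{ 0, 1, 2, \dots \}$ with probability 1, with $\mathbf{P}(X = k) = e^{-\alpha} \alpha^k / k!$.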
For discrete random variables, the expectation reduces to a weighted sum:
Theorem: Let $X : \Omega \longrightarrow \mathbb{R}$ be a discrete integrable random variable taking values in a countable $C \subseteq \mathbb{R}$ almost surely. Then $$ \mathbf{E}[X] = \sum_{x \in C} x \mathbf{P}(X = x) $$ with the sum converging absolutely.
Proof. We first evaluate the expectation of $X^+$. If $X$ takes values in $C$ almost surely, then $X^+$ takes values in $C^+ = C \cap \mathbb{R}_{\ge 0}$ almost surely. We have $$ \mathbf{E}[X^+] = \int_\mathbb{R} x \, d P_{X^+} = \int_{C^+} x \, d P_{X^+} + \int_{\mathbb{R} - C^+} x \, d P_{X^+} $$ Since $X^+$ lies in $\mathbb{R} - C^+$ with probability 0, we have $P_{X^+} (\mathbb{R} - C^+) = 0$, so that $$ \int_{\mathbb{R} - C^+} x \, d P_{X^+} = 0 $$ By countable additivity of the integral, the expectation is therefore $$ \mathbf{E}[X^+] = \int_{C^+} x \, d P_{X^+} = \sum_{c \in C^+} \int_{\{ c \}} x \, d P_{X^+} = \sum_{c \in C^+} c \mathbf{P} (X = c) $$ where in the last step we used that $\mathbf{P}(X^+ = c) = \mathbf{P}(X = c)$ for $c > 0$, and that the term $c = 0$ contributes nothing. By an exactly symmetric argument, $$ \mathbf{E}[X^-] = -\sum_{c \in C^-} c \mathbf{P} (X = c) $$ where $C^- = C \cap \mathbb{R}_{\le 0}$. Since $X$ is integrable, these two sums are finite. Since $C^+$ and $C^-$ intersect only possibly at $0$, which contributes nothing to either sum, we have $$ \mathbf{E}[X] = \mathbf{E}[X^+] - \mathbf{E}[X^-] = \sum_{c \in C^+} c \mathbf{P} (X = c) + \sum_{c \in C^-} c \mathbf{P} (X = c) = \sum_{c \in C} c \mathbf{P} (X = c) $$ The sum converges absolutely because its positive and negative parts are $\mathbf{E}[X^+]$ and $\mathbf{E}[X^-]$, both of which are finite. $\blacksquare$
So for a discrete random variable $X$ taking values in a countable $C$ almost surely, the expectation is $$ \mathbf{E}[X] = \sum_{x \in C} x \mathbf{P}(X = x) $$
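As a worked example, for the Poisson random variable $X$ with parameter $\alpha > 0$ considered above, $$ \mathbf{E}[X] = \sum_{k=0}^\infty k \, e^{-\alpha} \frac{\alpha^k}{k!} = \alpha e^{-\alpha} \sum_{k=1}^\infty \frac{\alpha^{k-1}}{(k-1)!} = \alpha e^{-\alpha} e^{\alpha} = \alpha $$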
Theorem: Let $X : \Omega \longrightarrow \mathbb{R}$ be a non-negative random variable. Then $$ \mathbf{E}[X] = \int_0^\infty \left( 1 - F_X(x) \right) \, dx $$ with both sides possibly infinite.
Proof. See that $$ \int_0^\infty \left( 1 - F_X(x) \right) \, dx = \int_0^\infty \mathbf{P} \left( X > x \right) \, dx = \int_0^\infty \mathbf{E} \left[ \mathbf{1}\{ X > x \} \right] \, dx = \mathbf{E} \left[ \int_0^\infty \mathbf{1}\{ X > x \} \, dx \right] $$ where interchanging the expectation and the integral is valid by the Fubini-Tonelli theorem since the integrand is non-negative. Since $X$ is non-negative, $$ \int_0^\infty \mathbf{1}\{ X > x \} \, dx = X $$ and the result follows. $\blacksquare$
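As a check, for the exponential random variable with rate $\alpha > 0$ considered earlier, $1 - F_X(x) = e^{-\alpha x}$ for $x \ge 0$, so $$ \mathbf{E}[X] = \int_0^\infty e^{-\alpha x} \, dx = \frac{1}{\alpha} $$ agreeing with the computation from the density.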